Identifying prognostic factors for survival in intensive care unit patients with SIRS or sepsis by machine learning analysis on electronic health records

Background Systemic inflammatory response syndrome (SIRS) and sepsis are the most common causes of in-hospital death. However, the characteristics associated with improvement in patient condition during the ICU stay have not been fully elucidated for each population, nor have the possible differences between the two. Goal The aim of this study is to highlight the differences between the prognostic clinical features for the survival of patients diagnosed with SIRS and those of patients diagnosed with sepsis by using a multi-variable predictive modeling approach with a reduced set of easily available measurements collected at admission to the intensive care unit (ICU). Methods Data were collected from 1,257 patients (816 non-sepsis SIRS and 441 sepsis) admitted to the ICU. We compared the performance of five machine learning models in predicting patient survival. The Matthews correlation coefficient (MCC) was used to evaluate model performance and feature importance within a Monte Carlo stratified cross-validation scheme. Results Extreme Gradient Boosting (MCC = 0.489) and Logistic Regression (MCC = 0.533) achieved the highest results for the SIRS and sepsis cohorts, respectively. In order of importance, APACHE II, mean platelet volume (MPV), eosinophil counts (EoC), and C-reactive protein (CRP) showed higher importance for predicting sepsis patient survival, whereas SOFA, APACHE II, platelet counts (PLTC), and CRP obtained higher importance in the SIRS cohort. Conclusion By using complete blood count parameters as predictors of ICU patient survival, machine learning models can accurately predict the survival of SIRS and sepsis ICU patients. Interestingly, feature importance highlights the role of CRP and APACHE II in both the SIRS and sepsis populations. In addition, MPV and EoC are shown to be important features for the sepsis population only, whereas SOFA and PLTC have higher importance for SIRS patients.


Introduction
Patient outcome has long been used as the primary endpoint for trials in critical care as well as for determining the patient's prognosis after treatment. Patient mortality and survival are indeed the major clinical outcomes, and they are the main targets for assessing prognostic factors driving patient conditions and the effectiveness of clinical interventions [1,2], especially in the intensive care unit (ICU), where admitted patients are usually in very critical condition and require constant monitoring and treatment.
In this context, sepsis represents an important global health problem, accounting for about one-third of ICU deaths; its reported incidence is still increasing [3][4][5][6], and a proper and precise description of sepsis is still not available. Indeed, the definition of sepsis has been revised several times over the years [7][8][9], and according to the Third International Consensus Definitions for Sepsis and Septic Shock [9] it is currently defined as a life-threatening organ dysfunction caused by a dysregulated host response to infection. This last update of the sepsis definition abandons the use of Systemic Inflammatory Response Syndrome (SIRS) criteria, which were recognized as lacking specificity, and focuses instead on the life-threatening condition and the presence and progression of organ failure. However, the major differences between SIRS and sepsis leading to a positive or negative patient outcome are still not fully elucidated in the medical literature.
In particular, Gucyetmez et al. [10] evaluated the ability of hemogram parameters, a set of medical laboratory tests providing information about the cells in a person's blood, and C-reactive protein (CRP) to distinguish non-sepsis SIRS from sepsis patients. The authors found that combinations of CRP, lymphocyte count (LymC), and platelet count (PLTC) can be used to determine the likelihood of sepsis, but they did not explore the association of these parameters with patient survival for each population. This information can provide significant indications about the most important prognostic factors specifically for non-sepsis SIRS and sepsis patients. Also, the authors did not investigate the predictive power of the observed variables.
Especially for this last task, machine learning approaches have shown a good ability in the early identification of sepsis with data collected both from electronic health records [11][12][13] and from physiological vital signs monitoring [14], also providing insights about the role of each feature in a multi-variable setting. Several studies focused on predicting ICU patient outcomes through mortality or survival prediction tasks [15][16][17][18][19][20][21][22], but multi-variable prognostic models estimating and comparing SIRS and sepsis outcomes are still lacking. Risk stratification is particularly important for patients with infections and with sepsis. In fact, these patients often require prompt management and interventions, like the initiation of antibiotic therapy and the administration of fluids and vasopressors for maintaining adequate tissue perfusion and hemodynamic stability [23]. These aspects have led to an increasing interest, in the last few years, in the prediction of patient outcome specifically for sepsis patients [24][25][26][27][28][29].
In this context, simple and easily available laboratory measurements of blood cell counts (for example platelet, eosinophil, neutrophil, and lymphocyte counts) can be useful tools for patients' risk stratification.
The goal of this study is to further explore the ability of hemogram parameters to estimate the survival of ICU patients with non-sepsis SIRS and with sepsis, by applying machine learning techniques to estimate the survival probability of ICU patients and to investigate the role of the different parameters in a multi-variable prediction setting. Specifically, our study makes further use of the features proposed by Gucyetmez and colleagues [10] to explore the predictive power of hemogram parameters in estimating the survival probability of patients with non-sepsis SIRS and with sepsis, by comparing different machine learning approaches.
Multi-variable feature importance of the best performing models is applied to further assess the role of each feature and to highlight differences between non-sepsis SIRS and sepsis cohorts.

Dataset
In this study, we use data retrospectively collected from 1,257 eligible medical and surgical patients admitted to the ICUs of Acibadem International Hospital and Atasehir Memorial Hospital (Istanbul, Turkey) between 1 January 2006 and 31 December 2013, and made available by Gucyetmez et al. [10]. The considered cohort includes 816 (64.9%) non-sepsis SIRS and 441 (35.1%) sepsis patients.
The dataset contains the following features for each patient: Age, sex, APACHE II and SOFA scores, diagnosis (medical, elective, and emergency surgery), length of ICU stay (LOS-ICU), mortality, CRP, WBCC, NeuC, LymC, NLCR, EoC, PLTC, MPV. A detailed description of the data used here is provided in Table 1. The target variable for our analysis was survival, indicating whether the patient survived (1) or died (0) in the ICU. A quantitative description of the distribution of each numeric and categorical feature for the non-sepsis SIRS (SIRS) and sepsis (SEPSIS) cohorts is reported in Tables 2 and 3. From the analysis of the target variable (Survival), it is possible to observe that both cohorts are unbalanced, with a stronger imbalance in the SIRS cohort (3.07% not survived) than in the SEPSIS cohort (23.64% not survived).

Methods
In this retrospective study, we developed predictive models of patient survival using machine learning algorithms, and we evaluated the importance of features associated with patient survival using machine learning and biostatistics approaches for the SIRS and SEPSIS populations separately (Fig 1). All the analyses were performed with the Python 3.8.3 programming language and the scikit-learn 1.0 and SciPy 1.7.1 software packages. Observations with missing information (three patients) were removed.

Associations between features and survival
The association between the input features and patient survival was also explored with classical statistical approaches. Specifically, differences in numeric features between survived and deceased groups in each cohort were tested with the Mann-Whitney U test, whereas differences in categorical features were assessed with the χ² test [30]. Statistical significance was defined for p < 0.005 as advocated by Benjamin et al. [31], which also accounts for multiple testing adjustments.
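As an illustration of the two tests named above, the following minimal sketch applies SciPy's `mannwhitneyu` and `chi2_contingency` to synthetic values with the p < 0.005 threshold. The variable names, group sizes, and numbers are assumptions made for the example only; they are not the study data and do not reproduce Table 4.

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu

ALPHA = 0.005  # significance threshold stated in the text

rng = np.random.default_rng(0)

# Synthetic stand-ins for one numeric feature (for example a severity score)
# in survived and deceased patients. These are NOT the study data.
score_survived = rng.normal(loc=15.0, scale=4.0, size=200)
score_deceased = rng.normal(loc=25.0, scale=4.0, size=40)

# Numeric feature: two-sided Mann-Whitney U test.
u_stat, p_numeric = mannwhitneyu(
    score_survived, score_deceased, alternative="two-sided"
)

# Categorical feature: chi-squared test on a synthetic 2x2 contingency table
# (rows: two categories, columns: survived / deceased).
table = np.array([[110, 20],
                  [90, 20]])
chi2_stat, p_categorical, dof, expected = chi2_contingency(table)

for name, p in [("numeric feature", p_numeric), ("categorical feature", p_categorical)]:
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{name}: p = {p:.4g} ({verdict} at alpha = {ALPHA})")
```

In practice the same two calls would be repeated for every feature and for each cohort separately, which is the situation the conservative threshold is meant to address.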

Survival prediction models
We trained five machine learning models with the goal of predicting patient survival from the following features: Age, Sex, SOFA, APACHE II, CRP, WBCC, NeuC, LymC, EoC, NLCR, PLTC, and MPV, for both the SIRS and SEPSIS cohorts. The approach consisted of 100 runs of Monte Carlo stratified cross-validation with an 80%-20% train-test split, as already proposed by Chicco et al. [32]. At each iteration, 80% of the data were used as a training set and 20% as a test set, keeping the ratio between survived and deceased patients constant. To limit the effect of class imbalance, we applied the Synthetic Minority Oversampling Technique (SMOTE) [33] to the training set. Features were rescaled before feeding them to the classifier by removing the median and dividing by the interquartile range, as estimated on the training set [37]. We considered the following five classifiers: Logistic Regression (LR), Support Vector Machine (SVM) [34], Decision Tree (Tree), Random Forest (RF) [35], and XGBoost (XGB) [36] (evaluation metric: logloss; objective function: binary/logistic). To evaluate the performance of the classifiers, Matthews correlation coefficients (MCC) [38] on the cross-validated test sets were considered because of the MCC's proven ability to summarize results from contingency tables and its invariance to class swapping.
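The evaluation loop described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the data are synthetic, only Logistic Regression is shown, the helper names `smote_like_oversample` and `monte_carlo_mcc` are hypothetical, and the oversampling function is a simplified SMOTE-style interpolation written for the example, not the reference SMOTE implementation. The order of oversampling and rescaling follows the order in which the text mentions them, which is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import RobustScaler


def smote_like_oversample(X, y, rng, k=5):
    """Simplified SMOTE-style oversampling (illustrative assumption):
    new minority points are interpolated between a minority sample and
    one of its k nearest minority neighbours until classes are balanced."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    X_min = X[y == minority]
    if n_new == 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Drop the first neighbour, which is the query sample itself.
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    pick = neigh[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[pick] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


def monte_carlo_mcc(X, y, n_runs=100, test_size=0.2, seed=0):
    """MCC on the held-out set for each stratified random 80/20 split."""
    rng = np.random.default_rng(seed)
    splitter = StratifiedShuffleSplit(
        n_splits=n_runs, test_size=test_size, random_state=seed
    )
    scores = []
    for train_idx, test_idx in splitter.split(X, y):
        # Oversample the training set only; the test set is left untouched.
        X_tr, y_tr = smote_like_oversample(X[train_idx], y[train_idx], rng)
        # RobustScaler removes the median and divides by the interquartile range.
        scaler = RobustScaler().fit(X_tr)
        model = LogisticRegression(max_iter=1000).fit(scaler.transform(X_tr), y_tr)
        y_pred = model.predict(scaler.transform(X[test_idx]))
        scores.append(matthews_corrcoef(y[test_idx], y_pred))
    return np.array(scores)


# Synthetic, imbalanced demo data (1 = survived); NOT the study data.
demo_rng = np.random.default_rng(42)
y_demo = (demo_rng.random(400) < 0.8).astype(int)
X_demo = demo_rng.normal(size=(400, 4))
X_demo[y_demo == 0, 0] += 2.0  # one informative feature

mcc_scores = monte_carlo_mcc(X_demo, y_demo, n_runs=20)
print("median MCC:", round(float(np.median(mcc_scores)), 3))
```

Fitting the scaler and the oversampler on the training portion only keeps information from the test patients out of model fitting, which is the point of evaluating on the held-out 20%.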

Feature importance
The best-performing model for each cohort was selected and feature importance was estimated through a single feature elimination (SFE) approach, that is, by evaluating the MCC obtained after removing one variable at a time. In this case, the smaller the resulting MCC, the higher the importance of the variable whose removal generated the observed drop in performance. Feature importance analysis was executed for 100 runs of Monte Carlo stratified cross-validation partitions with an 80%/20% train-test split [42]. The resulting MCCs for each run are obtained from the test set observations. Finally, we used the Spearman correlation coefficient and the Kendall coefficient [43] to quantify the correlation between the obtained ranks. Both coefficients range from -1 to 1 (from anticorrelation to perfect matching), whereas the absence of correlation is given by a coefficient of 0.
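A minimal sketch of single feature elimination and of the rank comparison is given below. It is an illustration under stated assumptions: the data and feature names (`f0` to `f3`) are synthetic, the helpers `median_mcc` and `single_feature_elimination` are hypothetical names, only Logistic Regression is used, and oversampling and rescaling are omitted for brevity. The two rank vectors at the end are made-up values used only to show the SciPy calls.

```python
import numpy as np
from scipy.stats import kendalltau, spearmanr
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import StratifiedShuffleSplit


def median_mcc(X, y, n_runs=30, seed=0):
    """Median test-set MCC over stratified random 80/20 splits."""
    splitter = StratifiedShuffleSplit(n_splits=n_runs, test_size=0.2, random_state=seed)
    scores = []
    for tr, te in splitter.split(X, y):
        model = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
        scores.append(matthews_corrcoef(y[te], model.predict(X[te])))
    return float(np.median(scores))


def single_feature_elimination(X, y, names, **kwargs):
    """Median MCC after dropping each feature in turn.
    A lower value means the removed feature was more important."""
    return {
        name: median_mcc(np.delete(X, j, axis=1), y, **kwargs)
        for j, name in enumerate(names)
    }


# Synthetic data in which only the first feature drives the outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.3 * rng.normal(size=300) > 0).astype(int)
names = ["f0", "f1", "f2", "f3"]

sfe = single_feature_elimination(X, y, names, n_runs=30)
ranking = sorted(sfe, key=sfe.get)  # most important feature first
print("ranking:", ranking)

# Comparing two hypothetical feature rankings (for example, one per cohort).
rank_a = [1, 2, 3, 4, 5, 6]
rank_b = [2, 1, 4, 3, 6, 5]
rho, p_rho = spearmanr(rank_a, rank_b)
tau, p_tau = kendalltau(rank_a, rank_b)
print(f"Spearman rho = {rho:.3f}, Kendall tau = {tau:.3f}")
```

Because each feature is removed on its own, a feature that is strongly correlated with another may show only a small drop in MCC; this is a known limitation of this kind of elimination scheme.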

Associations between features and survival
Results of the statistical analysis are reported in Table 4.
The best score when predicting survival on the SIRS cohort was achieved with the XGB method, which reached an MCC of 0.489. The second and third best-performing models in the SIRS cohort were RF, with MCC equal to 0.39, and LR, with MCC equal to 0.379. SVM and Tree showed scores equal to 0.378 and 0.289, respectively.
Table 4. p-values obtained from the statistical analysis when testing associations with patient survival in the SEPSIS and SIRS cohorts. Differences in numeric features between survived and deceased groups in the two cohorts were tested with the Mann-Whitney U test [44], whereas differences in categorical features were assessed with the χ² test.

Feature ranking
This section describes the results obtained with the SFE approach applied to the models with the highest performance in the prediction task on each of the two cohorts. Median values and interquartile ranges for the resulting MCCs are reported in the supplementary material (Text E of S1 Appendix). A graphical representation of feature importance is shown in Fig 4a for the SEPSIS cohort and in Fig 4b for the SIRS cohort, where features were ordered from the lowest to the highest importance.
Specifically, APACHE II showed the highest importance, that is, the lowest resulting median MCC, equal to 0.436 (-0.097 relative to the full model), in predicting SEPSIS patient survival with a Logistic Regression model. MPV ranked second in terms of feature importance for this specific cohort, with MCC equal to 0.484. The other features did not induce a notable decrease in the model's performance. Feature ranking for the survival prediction of SIRS patients with the XGB algorithm showed that SOFA has the highest importance, with a resulting MCC equal to 0.381 (-0.108 relative to the full model) when the feature is removed.
The Spearman and Kendall coefficients did not show a significant correlation between the feature rankings of the two cohorts, with values equal to -0.091 (p = 0.737) and -0.007 (p = 0.983), respectively.

Discussion
Gucyetmez et al. [10] collected the data used in this study to explore the ability of hemogram parameters and CRP to discriminate between the SIRS and SEPSIS cohorts. However, the authors did not investigate the prognostic role of the selected features within each cohort; therefore, our study aimed to investigate in more detail the importance of these features and the possible differences between the considered cohorts. Specifically, we evaluated the ability of hemogram parameters to predict the survival of ICU patients diagnosed with SIRS or SEPSIS, using a set of parameters usually available in the patient clinical records. We used widely available features like patient sex, illness severity scores commonly measured and recorded at admission to the ICU, C-reactive protein, and blood cell count measurements. Patients' comorbidities were not available in the patient records shared by Gucyetmez et al., although they are commonly available in an ICU setting, which represents a significant lack of information. The developed models would likely have benefited from more information about the patient's history, and this could have led to a more precise identification of differences in the prognostic factors. Therefore, future studies will focus on extending these analyses to more complete data including patients' comorbidities. The survival prediction models were developed and tested on the SIRS and SEPSIS cohorts, with better performance observed in the SEPSIS cohort.
Specifically, among all trained ML models, linear-based models like LR and SVM showed higher performances in the SEPSIS cohort, whereas RF and XGB performed better on the SIRS cohort.
Although average calibration curves are sub-optimal, which is likely due to the reduced sample size, the best-performing models show improved calibration with respect to the worst ones, as expected. This behavior suggests that a bigger population would allow for a proper calibration adjustment and translation of the model's output score to an even more precise individualized patient survival probability. Also, this approach might account for a possible covariate shift due to changes in patient characteristics, without the need for the development of a new model. Therefore, we consider the relationships between the available variables, both intra- and inter-population, to provide a reliable multivariable comparison of the major factors predicting survival for both SIRS and sepsis patients. As can be observed in Text F of S1 Appendix, the sensitivity analysis implementing hyperparameter optimization shows results very close to those observed without hyperparameter optimization, thus highlighting the robustness of the proposed framework.
The feature importance analysis attributed the highest importance to the APACHE II and SOFA scores for the SEPSIS and SIRS cohorts, respectively, thus confirming the importance of a preliminary assessment of patients' risk at admission to the ICU [45]. This result is also confirmed by the statistical analysis, as shown in Table 4.

SEPSIS cohort
Results on the SEPSIS cohort showed that MPV was the second most important variable in predicting survival. This result is in line with the observed association of a higher MPV with an increased mortality risk as well as its predictive role [46][47][48]. Our analysis ranked EoC and CRP as the third and fourth most important features. In the literature, a lower EoC has been associated with mortality in critically ill medical patients [49], in patients admitted with an exacerbation of chronic obstructive pulmonary disease [50], and in patients with pneumonia [51]. Interestingly, although non-significant, our cohort showed an increase in EoC in deceased SEPSIS patients. An epidemiological study [52] pointed out that eosinophilia is a predictor of all-cause mortality and that an increased number of peripheral blood eosinophils may reflect an increased inflammatory response, resulting in tissue injury, a condition that may reflect our cohort. CRP was the fourth most important variable in our ML model. Of note, CRP had already shown the potential of being a predictor of survival of ICU patients [53], and more generally a predictor of mortality in ML frameworks [54].

SIRS cohort
Interestingly, the third most important variable in predicting survival for the SIRS cohort was platelet count, with a smaller median value for the non-survived patients than for the survived group. Indeed, Vanderschueren et al. [55] found that thrombocytopenia was associated with a higher risk of death in a septic cohort, in line with our results considering that the definition of sepsis (sepsis-1) used in 2000 only required two SIRS criteria. CRP ranked fourth in predicting survival of patients with non-sepsis SIRS, and similar considerations as for the SEPSIS cohort can be made; moreover, its importance in predicting survival of a non-sepsis SIRS cohort was already observed in animal studies [56]. The fifth and sixth-ranked features were lymphocytes and eosinophils. In the literature, lymphocyte counts were found to be associated with increased mortality risk in general ICU patients [57], heart failure patients [58], and COVID-19 patients [59]. Eosinophil count significantly differed between survived and deceased groups, with an increase in the deceased one. Similar considerations as for the SEPSIS cohort can also be made for eosinophil counts, where we already pointed out that this apparently opposite behavior with respect to the literature might be due to the specific cohort of our study, which is undergoing an inflammatory response [52].

General considerations and applicability
The developed models show the ability to predict patient survival. Specifically, this study can be considered an important complement to the study performed by Gucyetmez et al. [10]: once a patient with an inflammatory response has been identified as septic or not, the corresponding model makes it possible to immediately assess the likelihood of survival. Also, the feature importance analysis proposed in our study gives a clue about the main features that contributed to the developed cohort-specific score, and it suggests to clinicians which of the considered variables is more informative for a patient falling in the SIRS or SEPSIS cohort. It is important to note that the SFE method presents some limitations when features are highly interdependent, since the contribution of a very important feature may still be underestimated due to the effect of other covariates that depend on it.
Finally, it is worth mentioning that we are not aware of whether these data were collected for administrative health reasons or whether they are commonly used in clinical practice, which might limit the general applicability of the results. However, the use of data like these for scientific analyses based on computational intelligence can allow new scientific discoveries that would otherwise be impossible with traditional hospital technologies.
This study presents an original application of a statistical framework aimed at predicting patient survival. As the approach is mainly limited by the reduced sample size of the cohort, it is expected that a larger collection of data would allow for a more effective model calibration and optimization that would further improve the model generalizability, thus providing a more precise estimate of patient survival probability.

Conclusions
The proposed study applies an original machine learning paradigm for processing clinical information at admission to the ICU to predict patient survival. The proposed approach relies on multi-variable predictive modeling based on information gathered at ICU admission and is aimed at predicting the likelihood of survival for patients with SIRS and with SEPSIS. Results provide insights into the differences in the most relevant variables between the two groups. A Monte Carlo cross-validation procedure was further applied to obtain robust estimates of the scores. The performed sensitivity analysis showed that results did not notably vary with hyperparameter tuning, thus confirming the need for a larger cohort to advance to a fully calibrated deployable model.
In this context, the Logistic Regression and XGBoost algorithms are the best-performing models for the SEPSIS and SIRS cohorts, respectively. Moreover, feature importance analysis revealed a high importance of the APACHE II score and a comparably important role of C-reactive protein in both cohorts. Also, MPV and EoC were revealed to be important predictors of survival mainly in the SEPSIS cohort, whereas they showed a secondary role in the SIRS cohort.
The SIRS cohort showed greater importance of the SOFA and platelet count features, which instead ranked last in the SEPSIS cohort.
Importantly, whereas the findings of Gucyetmez et al. [10] address the question of whether a patient has sepsis or not, our models give clinicians the possibility to estimate the patient's survival, as well as to identify the most important features involved in the risk stratification of patients with SIRS or SEPSIS, which also led to the proposed survival estimates.
Of note, to our knowledge, this is the first study where the ability of hemogram parameters in predicting patient survival at the admission in the ICU and the role of the considered features are investigated to highlight differences between SIRS and SEPSIS patients.
The MCC [39][40][41][60] can take values ranging from -1 to +1, where -1 represents the misclassification of all observations, 0 represents a random association, and +1 represents perfect classification. Average Receiver Operating Characteristic (ROC) curves and Precision-Recall curves (PRC) are also used to quantitatively assess the average model performances. Further details are reported in the supplementary material in Text A of S1 Appendix, where additional sensitivity analyses summarizing model calibration on the test set (Text D of S1 Appendix) and the model performance with hyperparameter optimization (Text F of S1 Appendix) are also reported.
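For reference, the MCC is computed from the four cells of a binary confusion matrix as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The short sketch below, using only the Python standard library and made-up counts, illustrates the range of values and the invariance to swapping the positive and negative class. The function name `mcc` and the convention of returning 0 when the denominator is zero are choices made for this example.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention assumed here: an empty row or column gives MCC = 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Every observation classified correctly, then every observation misclassified.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)
inverted = mcc(tp=0, tn=0, fp=50, fn=50)

# Swapping which class is called "positive" exchanges TP with TN and FP with FN;
# the coefficient is unchanged.
original = mcc(tp=90, tn=5, fp=3, fn=2)
swapped = mcc(tp=5, tn=90, fp=2, fn=3)

print(perfect, inverted)
print(round(original, 3), round(swapped, 3))
```

This invariance is relevant for imbalanced cohorts such as those studied here, where metrics that depend on the choice of positive class can give a misleading picture.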

Fig 3 shows the median ROC and PRC curves for SIRS and SEPSIS cohorts as an overall summary of models' performances across all Monte Carlo runs.

Fig 2. Matthews correlation coefficients (MCC) of the different families of machine learning models in predicting Survival for the SIRS and SEPSIS cohorts. Each violin plot shows the distribution of the data, whereas the small boxplot inside each violin plot shows the median and the first and third quartiles; the whiskers indicate the 0.05 and 0.95 quantiles.

Fig 3. Receiver Operating Characteristic (ROC) curves for SIRS (panel (a)) and SEPSIS (panel (b)) cohorts showing the median performances of each model on the test sets generated during Monte Carlo cross-validation. Panels (c) and (d) depict Precision-Recall Curves (PRC) for SIRS and SEPSIS cohorts, respectively, showing the median performances of each model on the test sets generated during Monte Carlo cross-validation. https://doi.org/10.1371/journal.pdig.0000459.g003

Fig 4. Matthews correlation coefficients after single feature elimination for the Survival prediction task performed with (a) the Logistic Regression model in the SEPSIS cohort and (b) the XGBoost model in the SIRS cohort. Features were ranked according to importance from left to right. Each violin plot shows the distribution of the data, whereas the small boxplot inside each violin plot shows the median and the first and third quartiles; the whiskers indicate the 0.05 and 0.95 quantiles. https://doi.org/10.1371/journal.pdig.0000459.g004

Table 1. Description, unit of measure and range of values of each available feature in the dataset. EC: Elective, AC: Emergency, and M: Medical. E: Male and K: Female. https://doi.org/10.1371/journal.pdig.0000459.t001

Table 3. Values, counts and percentages for each categorical variable of the dataset, stratified by Survival and for the full non-sepsis SIRS cohort. DIAG.: Diagnosis, S: Survived, NS: Not Survived, EC: Elective, AC: Emergency, and M: Medical. E: Male, and K: Female.

Table 4 shows that the APACHE II and SOFA scores were significantly associated (p < 0.0001) with survival in both the SIRS and SEPSIS cohorts. EoC was significantly associated (p < 0.0001) with survival in the SIRS cohort only.
Survival prediction performances for the SEPSIS and SIRS cohorts are graphically summarized in Fig 2. Median MCCs, accuracy, sensitivity, specificity, F1-scores, positive predictive value, negative predictive value, areas under the precision-recall and receiver operating characteristic curves, and the respective interquartile ranges are reported in the supplementary material (Text C of S1 Appendix). SVM and LR obtained the highest MCCs in predicting sepsis patient survival, both equal to 0.533. Random Forest was the second-best model in this cohort, with MCC equal to 0.516, whereas XGBoost and Tree obtained the lowest results, 0.459 and 0.368, respectively. LR was chosen as the best-performing model because of its higher third quartile.